\subsubsection{ARM + \OptimizingXcodeIV (\ARMMode)}

\lstinputlisting[caption=\OptimizingXcodeIV (\ARMMode),label=ARM_leaf_example4,style=customasmARM]{patterns/14_bitfields/4_popcnt/ARM_Xcode_O3_EN.lst}

\myindex{ARM!\Instructions!TST}
\TST is the same things as \TEST in x86.

\myindex{ARM!Optional operators!LSL}
\myindex{ARM!Optional operators!LSR}
\myindex{ARM!Optional operators!ASR}
\myindex{ARM!Optional operators!ROR}
\myindex{ARM!Optional operators!RRX}
\myindex{ARM!\Instructions!MOV}
\myindex{ARM!\Instructions!TST}
\myindex{ARM!\Instructions!CMP}
\myindex{ARM!\Instructions!ADD}
\myindex{ARM!\Instructions!SUB}
\myindex{ARM!\Instructions!RSB}
As was noted before~(\myref{shifts_in_ARM_mode}),
there are no separate shifting instructions in ARM mode.
However, there are modifiers 
LSL (\IT{Logical Shift Left}), 
LSR (\IT{Logical Shift Right}), 
ASR (\IT{Arithmetic Shift Right}), 
ROR (\IT{Rotate Right}) and
RRX (\IT{Rotate Right with Extend}), which may be added to such instructions as \MOV, \TST,
\CMP, \ADD, \SUB, \RSB\footnote{\DataProcessingInstructionsFootNote}.

These modificators define how to shift the second operand and by how many bits.

\myindex{ARM!\Instructions!TST}
\myindex{ARM!Optional operators!LSL}
Thus the \TT{\q{TST R1, R2,LSL R3}} instruction works here as $R1 \land (R2 \ll R3)$.

\subsubsection{ARM + \OptimizingXcodeIV (\ThumbTwoMode)}

\myindex{ARM!\Instructions!LSL.W}
\myindex{ARM!\Instructions!LSL}
Almost the same, but here are two \INS{LSL.W}/\TST instructions are used instead of a single \TST, because in Thumb mode it is not
possible to define \LSL modifier directly in \TST.

\begin{lstlisting}[label=ARM_leaf_example5,style=customasmARM]
                MOV             R1, R0
                MOVS            R0, #0
                MOV.W           R9, #1
                MOVS            R3, #0
loc_2F7A
                LSL.W           R2, R9, R3
                TST             R2, R1
                ADD.W           R3, R3, #1
                IT NE
                ADDNE           R0, #1
                CMP             R3, #32
                BNE             loc_2F7A
                BX              LR
\end{lstlisting}

\subsubsection{ARM64 + \Optimizing GCC 4.9}

Let's take the 64-bit example which has been already used: \myref{popcnt_x64_example}.

\lstinputlisting[caption=\Optimizing GCC (Linaro) 4.8,style=customasmARM]{patterns/14_bitfields/4_popcnt/ARM64_GCC_O3_EN.s}

The result is very similar to what GCC generates for x64: \myref{shifts64_GCC_O3}.

\myindex{ARM!\Instructions!CSEL}
The \CSEL instruction is \q{Conditional SELect}. 
It just chooses one variable of two depending on the flags set by \TST and copies the value into \RegW{2}, which holds the \q{rt} variable.

\subsubsection{ARM64 + \NonOptimizing GCC 4.9}

And again, we'll work on the 64-bit example which was already used: \myref{popcnt_x64_example}.
The code is more verbose, as usual.

\lstinputlisting[caption=\NonOptimizing GCC (Linaro) 4.8,style=customasmARM]{patterns/14_bitfields/4_popcnt/ARM64_GCC_O0_EN.s}

